Used Cars Prediction Analysis¶

There are many factors to consider when purchasing a car, the main one being whether to buy new or used. If you are trying to manage your finances wisely, a pre-owned car is often the better choice. Although a new car may sound tempting, its rapid depreciation, higher price, and higher insurance cost, among other factors, work against it.

Importing necessary libraries¶

Data Preprocessing¶

Test data, first five rows:
   Unnamed: 0  Name                                       Location    Year  Kilometers_Driven  Fuel_Type  Transmission  Owner_Type  Mileage      Engine   Power      Seats  New_Price
0  0           Maruti Alto K10 LXI CNG                    Delhi       2014  40929              CNG        Manual        First       32.26 km/kg  998 CC   58.2 bhp   4.0    NaN
1  1           Maruti Alto 800 2016-2019 LXI              Coimbatore  2013  54493              Petrol     Manual        Second      24.7 kmpl    796 CC   47.3 bhp   5.0    NaN
2  2           Toyota Innova Crysta Touring Sport 2.4 MT  Mumbai      2017  34000              Diesel     Manual        First       13.68 kmpl   2393 CC  147.8 bhp  7.0    25.27 Lakh
3  3           Toyota Etios Liva GD                       Hyderabad   2012  139000             Diesel     Manual        First       23.59 kmpl   1364 CC  null bhp   5.0    NaN
4  4           Hyundai i20 Magna                          Mumbai      2014  29000              Petrol     Manual        First       18.5 kmpl    1197 CC  82.85 bhp  5.0    NaN
Training data, first five rows:
   Unnamed: 0  Name                              Location    Year  Kilometers_Driven  Fuel_Type  Transmission  Owner_Type  Mileage     Engine   Power      Seats  New_Price  Price
0  0           Maruti Wagon R LXI CNG            Mumbai      2010  72000              CNG        Manual        First       26.6 km/kg  998 CC   58.16 bhp  5.0    NaN        1.75
1  1           Hyundai Creta 1.6 CRDi SX Option  Pune        2015  41000              Diesel     Manual        First       19.67 kmpl  1582 CC  126.2 bhp  5.0    NaN        12.50
2  2           Honda Jazz V                      Chennai     2011  46000              Petrol     Manual        First       18.2 kmpl   1199 CC  88.7 bhp   5.0    8.61 Lakh  4.50
3  3           Maruti Ertiga VDI                 Chennai     2012  87000              Diesel     Manual        First       20.77 kmpl  1248 CC  88.76 bhp  7.0    NaN        6.00
4  4           Audi A4 New 2.0 TDI Multitronic   Coimbatore  2013  40670              Diesel     Automatic     Second      15.2 kmpl   1968 CC  140.8 bhp  5.0    NaN        17.74
Unnamed: 0             int64
Name                  object
Location              object
Year                   int64
Kilometers_Driven      int64
Fuel_Type             object
Transmission          object
Owner_Type            object
Mileage               object
Engine                object
Power                 object
Seats                float64
New_Price             object
Price                float64
dtype: object
Unnamed: 0             int64
Name                  object
Location              object
Year                   int64
Kilometers_Driven      int64
Fuel_Type             object
Transmission          object
Owner_Type            object
Mileage               object
Engine                object
Power                 object
Seats                float64
New_Price             object
dtype: object
Rows in dataset are : 6019 
Columns in dataset are : 14
Rows in test dataset are : 1234 
Columns in test dataset are : 13
Training data, summary statistics:
       Unnamed: 0   Year         Kilometers_Driven  Seats        Price
count  6019.000000  6019.000000  6.019000e+03       5977.000000  6019.000000
mean   3009.000000  2013.358199  5.873838e+04       5.278735     9.479468
std    1737.679967  3.269742     9.126884e+04       0.808840     11.187917
min    0.000000     1998.000000  1.710000e+02       0.000000     0.440000
25%    1504.500000  2011.000000  3.400000e+04       5.000000     3.500000
50%    3009.000000  2014.000000  5.300000e+04       5.000000     5.640000
75%    4513.500000  2016.000000  7.300000e+04       5.000000     9.950000
max    6018.000000  2019.000000  6.500000e+06       10.000000    160.000000
Test data, summary statistics:
       Unnamed: 0   Year         Kilometers_Driven  Seats
count  1234.000000  1234.000000  1234.000000        1223.000000
mean   616.500000   2013.400324  58507.288493       5.284546
std    356.369424   3.179700     35598.702098       0.825622
min    0.000000     1996.000000  1000.000000        2.000000
25%    308.250000   2011.000000  34000.000000       5.000000
50%    616.500000   2014.000000  54572.500000       5.000000
75%    924.750000   2016.000000  75000.000000       5.000000
max    1233.000000  2019.000000  350000.000000      10.000000
Unnamed: 0              0
Name                    0
Location                0
Year                    0
Kilometers_Driven       0
Fuel_Type               0
Transmission            0
Owner_Type              0
Mileage                 2
Engine                 36
Power                  36
Seats                  42
New_Price            5195
Price                   0
dtype: int64
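
The counts above show that New_Price is empty in 5195 of the 6019 training rows, while Mileage, Engine, Power and Seats have only a few gaps. Later outputs no longer list New_Price and report zero missing values everywhere. The sketch below shows one way such a cleanup could be done; the 50% threshold, the choice to drop rows rather than impute them, and the tiny sample frame (including its NaN in Seats) are assumptions made for illustration, not the notebook's actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical four-row frame that mimics the missing-value pattern above.
df = pd.DataFrame({
    "Name": ["Maruti Wagon R LXI CNG", "Honda Jazz V",
             "Maruti Ertiga VDI", "Audi A4 New 2.0 TDI Multitronic"],
    "Seats": [5.0, 5.0, np.nan, 5.0],
    "New_Price": [np.nan, "8.61 Lakh", np.nan, np.nan],
    "Price": [1.75, 4.50, 6.00, 17.74],
})

# Fraction of missing values in each column.
missing_share = df.isnull().mean()

# Assumption: columns that are more than half empty are dropped entirely.
sparse_cols = missing_share[missing_share > 0.5].index

# Assumption: the few rows that still contain a gap are dropped as well.
cleaned = df.drop(columns=sparse_cols).dropna()

print(cleaned.isnull().sum())
```

Dropping rows is the simplest option when only a handful are affected; filling the gaps with a median or a per-model value would be an alternative if every row has to be kept.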
Missing values in first list: {'Volkswagen CrossPolo 1.2 TDI', 'Mahindra KUV 100 mFALCON D75 K6 5str AW', 'Mercedes-Benz E-Class E240 V6 AT', 'Mahindra Scorpio VLS 2.2 mHawk', 'Honda Accord 2001-2003 2.3 VTI L MT', 'Skoda Superb Petrol Ambition', 'Skoda Laura 1.8 TSI Ambition', 'Land Rover Freelander 2 S Business Edition', 'Honda Civic 2010-2013 1.8 V AT', 'Skoda Rapid Ultima 1.6 TDI Ambition Plus', 'Hyundai Santro LS zipDrive Euro I', 'Toyota Land Cruiser Prado VX L', 'Hyundai Creta 1.6 VTVT Base', 'Renault Lodgy 110PS RxL', 'Hyundai i20 1.4 Asta AT (O) with Sunroof', 'Volkswagen Vento 1.5 TDI Highline Plus', 'Maruti Swift VVT ZXI', 'Mahindra KUV 100 D75 K8 5Str', 'Hindustan Motors Contessa 2.0 DSL', 'Nissan Terrano XE 85 PS', 'Hyundai Verna Transform SX VGT CRDi BS III', 'Volkswagen Jetta 2007-2011 1.6 Trendline', 'Hyundai EON 1.0 Era Plus', 'Toyota Corolla Altis GL', 'Hyundai Elantra GT', 'Mercedes-Benz S Class 2005 2013 320 L', 'Audi Q5 2008-2012 3.0 TDI Quattro', 'Maruti 800 DX', 'Fiat Punto 1.4 Emotion', 'Honda Mobilio V i VTEC', 'Mahindra Bolero SLX', 'Jaguar XF 2.0 Petrol Portfolio', 'Tata Tiago 1.05 Revotorq XT Option', 'BMW 5 Series 520d Sedan', 'Hyundai Accent GLX', 'Honda Amaze VX CVT i-VTEC', 'Chevrolet Enjoy Petrol LTZ 7 Seater', 'Ford Freestyle Titanium Plus Diesel', 'Nissan 370Z AT', 'Tata Indica Vista Aqua 1.2 Safire', 'Mitsubishi Pajero Sport 4X2 AT', 'Mahindra KUV 100 mFALCON G80 K4 5str', 'Mercedes-Benz A Class Edition 1', 'Tata Sumo EX 10/7 Str BSII', 'Mahindra Xylo E9', 'Maruti Vitara Brezza ZDi AMT', 'Chevrolet Spark 1.0 PS', 'Honda City i DTec VX Option BL', 'Toyota Etios Cross 1.2L G', 'Volkswagen Vento 1.5 TDI Highline Plus AT', 'OpelCorsa 1.4Gsi', 'Chevrolet Enjoy 1.4 LTZ 8', 'Maruti Ciaz VDi Option SHVS', 'Maruti Vitara Brezza ZDi Plus AMT', 'Mercedes-Benz E-Class 250 D W 124', 'Mahindra Bolero Power Plus ZLX', 'Tata Tiago AMT 1.2 Revotron XTA', 'Hyundai i20 2015-2017 Magna Optional 1.4 CRDi', 'Maruti Ertiga VXI Petrol', 'Honda BR-V 
i-DTEC S MT', 'Honda BR-V i-VTEC VX MT', 'Mahindra Scorpio VLX 2WD BSIII', 'Tata Indica V2 DiCOR DLG BS-III', 'Ford Ikon 1.4 ZXi', 'Volvo S60 D5 Kinetic', 'Hyundai Verna Transform VTVT with Audio', 'Chevrolet Sail Hatchback 1.2', 'Toyota Etios Liva VD', 'Toyota Innova 2.0 V', 'Fiat Avventura FIRE Dynamic', 'Maruti Swift AMT ZXI', 'Honda CR-V Diesel', 'BMW 7 Series 740i Sedan', 'Fiat Abarth 595 Competizione', 'Honda Jazz VX CVT', 'Ford Classic 1.4 Duratorq CLXI', 'Skoda Laura L and K MT', 'Maruti Swift 1.3 VXi', 'Tata Indica Vista Aqua TDI BSIII', 'Ford Fiesta Classic 1.6 Duratec LXI', 'Land Rover Discovery 4 TDV6 Auto Diesel', 'Maruti Ignis 1.2 AMT Delta', 'Toyota Innova 2.5 GX 8 STR', 'Chevrolet Enjoy 1.3 TCDi LTZ 7', 'Hyundai EON 1.0 Kappa Magna Plus', 'Hyundai Creta 1.6 SX Diesel', 'Skoda Octavia 2.0 TDI MT Style', 'Fiat Punto EVO 1.3 Emotion', 'Maruti Ciaz VXi', 'Hyundai Elantra SX AT', 'Fiat Linea Classic 1.3 Multijet', 'Hyundai i20 new Sportz AT 1.4', 'Bentley Flying Spur W12', 'BMW 3 Series GT 320d Sport Line', 'Skoda Laura 1.9 TDI MT Elegance', 'Tata Tigor 1.2 Revotron XZ Option', 'Honda City ZX VTEC Plus', 'Mercedes-Benz GLA Class 220 d 4MATIC', 'Tata Indica Vista Quadrajet LX', 'Maruti SX4 ZXI AT', 'Maruti Alto XCITE', 'Tata Indica Vista Terra 1.2 Safire BS IV', 'Hyundai Verna 1.4 CX', 'Mahindra TUV 300 2015-2019 T8 AMT', 'Mahindra Scorpio S10 8 Seater', 'Hyundai Xcent 1.2 CRDi SX', 'Mahindra Thar 4X4', 'Mahindra Scorpio VLX Special Edition BS-IV', 'Renault Pulse RxZ', 'Hyundai Santro Xing XG AT eRLX Euro III', 'Land Rover Discovery 4 SDV6 SE', 'Maruti Wagon R VXI AMT Opt', 'Maruti Ritz VDi ABS', 'Hyundai Tucson 2.0 e-VGT 4WD AT GLS', 'Mahindra Xylo H9', 'Ford Endeavour 3.0L AT 4x2', 'Toyota Innova Crysta Touring Sport 2.4 MT', 'Nissan Teana XL', 'Honda Jazz 2020 Petrol', 'Hyundai Accent Executive LPG', 'Nissan Micra XL CVT', 'Toyota Camry MT with Moonroof', 'Tata Indica Vista Terra Quadrajet 1.3L BS IV', 'Mercedes-Benz CLA 45 AMG', 'Toyota Etios Liva 
Diesel TRD Sportivo', 'Mercedes-Benz B Class B180 Sports', 'BMW X3 2.5si', 'Isuzu MU 7 4x2 HIPACK', 'Toyota Etios Liva 1.4 VXD', 'Volkswagen Vento 1.6 Trendline', 'Hyundai Sonata Embera 2.4L MT', 'Honda BRV i-DTEC V MT', 'Chevrolet Enjoy TCDi LS 7 Seater', 'Hyundai Elite i20 Magna Plus', 'Ford EcoSport 1.5 Petrol Ambiente', 'Renault Koleos 4X2 MT', 'Fiat Linea Dynamic', 'Datsun GO T Petrol', 'Volkswagen Polo ALLSTAR 1.2 MPI', 'Mahindra Verito Vibe 1.5 dCi D6', 'Volkswagen Vento 1.2 TSI Comfortline AT', 'Mahindra Scorpio SLX 2.6 Turbo 8 Str', 'Tata Manza Club Class Safire90 LX', 'Maruti A-Star Zxi', 'Mahindra TUV 300 P4', 'Hyundai Creta 1.6 SX Automatic', 'BMW 5 Series 530i Sport Line', 'Ford Fiesta Classic 1.6 SXI Duratec', 'Land Rover Range Rover HSE', 'Mahindra KUV 100 mFALCON D75 K2', 'Toyota Innova 2.5 LE 2014 Diesel 8 Seater', 'Fiat Avventura Urban Cross 1.3 Multijet Emotion', 'Hyundai i20 Active SX Diesel', 'Maruti Celerio X VXI Option', 'Fiat Grande Punto 1.2 Emotion', 'Maruti Versa DX2', 'Hyundai Santro Xing GLS CNG', 'Renault Duster 85PS Diesel RxZ', 'Honda Amaze E i-DTEC', 'Jeep Compass 1.4 Sport', 'Audi Q3 30 TDI S Edition', 'Mercedes-Benz B Class B180 Sport', 'Mahindra KUV 100 G80 K4 Plus 5Str', 'Ford Fiesta 1.4 SXI Duratorq', 'Hyundai i20 2015-2017 1.4 CRDi Sportz', 'Honda WRV i-DTEC VX', 'Honda Civic 2010-2013 1.8 S MT Inspire', 'BMW 7 Series 730Ld DPE Signature'}
False
Missing values in first list: {'Isuzu MU', 'Nissan 370Z', 'Bentley Flying', 'Toyota Land', 'Hindustan Motors', 'Fiat Abarth', 'OpelCorsa 1.4Gsi'}
Missing values in first list: set()
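
The three "Missing values in first list" printouts above shrink from a long set of full names, to seven two-word names, to an empty set. This is consistent with checking which names in one file never occur in the other, and then shortening Name to its first two words (the Cars column that appears in the later outputs). A minimal sketch of that idea follows; the helper `missing_from_first` and the two tiny frames are hypothetical, and the direction of the comparison is an assumption.

```python
import pandas as pd

# Hypothetical miniature versions of the training and test files.
train = pd.DataFrame({"Name": ["Maruti Wagon R LXI CNG",
                               "Hyundai Creta 1.6 CRDi SX Option",
                               "Honda Jazz V"]})
test = pd.DataFrame({"Name": ["Maruti Wagon R VXI AMT Opt",
                              "Honda Jazz VX CVT",
                              "Nissan 370Z AT"]})


def missing_from_first(first, second):
    """Return the values of `second` that never occur in `first`."""
    return set(second) - set(first)


# Full names rarely match exactly between the two files.
full_name_gap = missing_from_first(train["Name"], test["Name"])
print("Missing values in first list:", full_name_gap)

# Keep only the first two words (make and model) in a new Cars column.
for frame in (train, test):
    frame["Cars"] = frame["Name"].str.split().str[:2].str.join(" ")

cars_gap = missing_from_first(train["Cars"], test["Cars"])
print("Missing values in first list:", cars_gap)
```

Shortening the name makes most test cars match a category seen in training, which matters if Cars is later encoded as a categorical feature.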
Unnamed: 0           0
Name                 0
Location             0
Year                 0
Kilometers_Driven    0
Fuel_Type            0
Transmission         0
Owner_Type           0
Mileage              0
Engine               0
Power                0
Seats                0
Price                0
Cars                 0
dtype: int64
Name                 0
Location             0
Year                 0
Kilometers_Driven    0
Fuel_Type            0
Transmission         0
Owner_Type           0
Mileage              0
Engine               0
Power                0
Seats                0
Cars                 0
dtype: int64
Name                  object
Location              object
Year                   int64
Kilometers_Driven      int64
Fuel_Type             object
Transmission          object
Owner_Type            object
Mileage              float64
Engine               float64
Power                float64
Seats                float64
Cars                  object
dtype: object
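
In the raw files Mileage, Engine and Power are strings with a unit attached ("32.26 km/kg", "998 CC", "null bhp"), whereas the listing above reports them as float64. A minimal sketch of one way to make that conversion, assuming the number is always the first whitespace-separated token; the helper name `leading_number` is hypothetical.

```python
import pandas as pd

# Sample values copied from the first rows of the test data shown earlier.
df = pd.DataFrame({
    "Mileage": ["32.26 km/kg", "24.7 kmpl", "13.68 kmpl", "23.59 kmpl"],
    "Engine": ["998 CC", "796 CC", "2393 CC", "1364 CC"],
    "Power": ["58.2 bhp", "47.3 bhp", "147.8 bhp", "null bhp"],
})


def leading_number(column):
    """Parse the first whitespace-separated token of each value as a float."""
    first_token = column.str.split().str[0]
    # errors="coerce" turns unparseable tokens such as "null" into NaN.
    return pd.to_numeric(first_token, errors="coerce").astype("float64")


for name in ["Mileage", "Engine", "Power"]:
    df[name] = leading_number(df[name])

print(df.dtypes)
```

Note that "null bhp" becomes a missing value at this point, so the missing-value check has to be repeated after the conversion. The sketch also ignores the difference between km/kg and kmpl.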

Data Analysis¶

Price                1.000000
Power                0.769351
Engine               0.659117
Year                 0.305800
Seats                0.052262
Kilometers_Driven   -0.011263
Mileage             -0.313877
Name: Price, dtype: float64
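
A ranking like the one above can be produced by correlating every numeric column with Price and sorting the result. The sketch below uses only the five training rows shown earlier, so its numbers will differ from the full-data values; it illustrates the call pattern and is not the notebook's original cell.

```python
import pandas as pd

# The five training rows shown earlier, with units already stripped.
df = pd.DataFrame({
    "Year": [2010, 2015, 2011, 2012, 2013],
    "Kilometers_Driven": [72000, 41000, 46000, 87000, 40670],
    "Mileage": [26.6, 19.67, 18.2, 20.77, 15.2],
    "Engine": [998.0, 1582.0, 1199.0, 1248.0, 1968.0],
    "Power": [58.16, 126.2, 88.7, 88.76, 140.8],
    "Seats": [5.0, 5.0, 5.0, 7.0, 5.0],
    "Price": [1.75, 12.50, 4.50, 6.00, 17.74],
})

# Pearson correlation of each numeric column with Price, strongest first.
price_corr = (
    df.select_dtypes("number")
      .corr()["Price"]
      .sort_values(ascending=False)
)
print(price_corr)
```

The first entry is always Price itself with a value of 1.0, so the informative part of the ranking starts at the second row.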

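Conclusion- Price correlates most strongly with Power (0.77) and Engine (0.66), moderately with Year (0.31), and negatively with Mileage (-0.31). Seats and Kilometers_Driven show almost no linear relationship with Price. The value of 1.0 in the first row is simply Price correlated with itself. For a typical second-hand car, engine size and power are therefore the strongest indicators of price among these columns.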


Conclusion- Price was converted to log(Price) so that its distribution is closer to normal and easier to visualize.
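
The effect of the log transform can be checked numerically by comparing the skewness before and after. The sketch below is an illustration under assumptions: the sample combines the five prices shown earlier with the minimum (0.44) and maximum (160) from the summary table, and the natural logarithm is assumed since the text does not state the base.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: five prices from the first rows plus the reported min and max.
price = pd.Series([1.75, 12.50, 4.50, 6.00, 17.74, 0.44, 160.00], name="Price")

# Natural log compresses the long right tail of the distribution.
log_price = np.log(price)

print("skew of Price:     ", price.skew())
print("skew of log(Price):", log_price.skew())

# Values on the log scale can be mapped back to prices with np.exp.
restored = np.exp(log_price)
```

If a model is trained on log(Price), its predictions are on the log scale too and need the same `np.exp` step before they are reported as prices.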

[Pie chart of Price by Fuel_Type: Diesel 72.3%, Petrol 27.3%, CNG 0.347%, LPG 0.0438%]

Conclusion- The pie chart above shows how Price is split across fuel types (Diesel, Petrol, CNG, LPG). Diesel cars account for the largest share of the market price, well ahead of Petrol, while CNG and LPG are negligible. Diesel users may also be more numerous than the others because diesel cars give better mileage.

Conclusion- According to the plot, the number of customers using automatic-transmission vehicles has increased rapidly in consecutive years.

Conclusion- According to the plot, the number of customers using diesel vehicles has increased rapidly in consecutive years.

[Plot of Price (axis from 0 to 160) by Fuel_Type (CNG, Diesel, Petrol, LPG), colored by Transmission (Manual, Automatic)]

Conclusion- The graph shows that CNG and LPG cars are available only with manual transmission, whereas automatic transmission leads among diesel and petrol cars (diesel being the most used).

Conclusion- The graph clearly indicates that people prefer manual transmission over automatic.

Model fitting¶

   model                  Root Mean Squared Error  Accuracy on Training set  Accuracy on Testing set
3  MLPRegressor           209.405833               0.678774                  0.634821
4  AdaBoostRegressor      149.27056                0.828892                  0.814443
0  DecisionTreeRegressor  113.561756               0.999993                  0.892603
2  RandomForestRegressor  84.187587                0.991894                  0.940977
5  ExtraTreesRegressor    81.025873                0.999993                  0.945327
1  XGBRegressor           74.815814                0.994635                  0.953386
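
The table ranks the models by root mean squared error and by an "accuracy" value. For regressors, scikit-learn's `score` method returns the coefficient of determination (R^2); assuming that is what the accuracy columns contain, both metrics can be written out with NumPy as below. The target and prediction values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical targets and predictions, for illustration only.
y_true = [1.75, 12.50, 4.50, 6.00, 17.74]
y_pred = [2.00, 12.00, 5.00, 6.50, 17.00]

print("RMSE:", rmse(y_true, y_pred))
print("R^2: ", r_squared(y_true, y_pred))
```

Read this way, a training score of 0.999993 next to a testing score of 0.892603 (DecisionTreeRegressor) points to overfitting, while XGBRegressor has both the lowest error and the highest testing score in the table.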
   Car_id  Price
0  0       153.78
1  1       118.42
2  2       942.82
3  3       155.14
4  4       283.97

Observation- The table above shows the model's predicted price for each car whose specifications are given in the feature1 array.

Additional Analysis¶

array(['Mumbai', 'Pune', 'Chennai', 'Coimbatore', 'Hyderabad', 'Jaipur',
       'Kochi', 'Kolkata', 'Delhi', 'Bangalore', 'Ahmedabad'],
      dtype=object)
array(['Maruti Wagon', 'Hyundai Creta', 'Honda Jazz', 'Maruti Ertiga',
       'Audi A4', 'Hyundai EON', 'Nissan Micra', 'Toyota Innova',
       'Volkswagen Vento', 'Tata Indica', 'Maruti Ciaz', 'Honda City',
       'Maruti Swift', 'Land Rover', 'Mitsubishi Pajero', 'Honda Amaze',
       'Renault Duster', 'Mercedes-Benz New', 'BMW 3', 'Maruti S',
       'Audi A6', 'Hyundai i20', 'Maruti Alto', 'Honda WRV',
       'Toyota Corolla', 'Mahindra Ssangyong', 'Maruti Vitara',
       'Mahindra KUV', 'Mercedes-Benz M-Class', 'Volkswagen Polo',
       'Tata Nano', 'Hyundai Elantra', 'Hyundai Xcent', 'Mahindra Thar',
       'Hyundai Grand', 'Renault KWID', 'Hyundai i10', 'Nissan X-Trail',
       'Maruti Zen', 'Ford Figo', 'Mercedes-Benz C-Class',
       'Porsche Cayenne', 'Mahindra XUV500', 'Nissan Terrano',
       'Honda Brio', 'Ford Fiesta', 'Hyundai Santro', 'Tata Zest',
       'Maruti Ritz', 'BMW 5', 'Toyota Fortuner', 'Ford Ecosport',
       'Hyundai Verna', 'Datsun GO', 'Maruti Omni', 'Toyota Etios',
       'Jaguar XF', 'Maruti Eeco', 'Honda Civic', 'Volvo V40',
       'Mercedes-Benz B', 'Mahindra Scorpio', 'Honda CR-V',
       'Mercedes-Benz SLC', 'BMW 1', 'Chevrolet Beat', 'Skoda Rapid',
       'Audi RS5', 'Mercedes-Benz S', 'Skoda Superb', 'BMW X5',
       'Mercedes-Benz GLC', 'Mini Countryman', 'Chevrolet Optra',
       'Renault Lodgy', 'Mercedes-Benz E-Class', 'Maruti Baleno',
       'Skoda Laura', 'Mahindra NuvoSport', 'Skoda Fabia', 'Tata Indigo',
       'Audi Q3', 'Skoda Octavia', 'Audi A8', 'Mahindra Verito',
       'Mini Cooper', 'Hyundai Santa', 'BMW X1', 'Hyundai Accent',
       'Hyundai Tucson', 'Mercedes-Benz GLE', 'Maruti A-Star',
       'Fiat Grande', 'BMW X3', 'Ford EcoSport', 'Audi Q7',
       'Volkswagen Jetta', 'Mercedes-Benz GLA', 'Maruti Celerio',
       'Tata Sumo', 'Honda Accord', 'BMW 6', 'Tata Manza',
       'Chevrolet Spark', 'Mini Clubman', 'Nissan Teana', 'Maruti 800',
       'Honda BRV', 'Jaguar XE', 'Tata Xenon', 'Audi A3',
       'Mercedes-Benz GL-Class', 'Honda BR-V', 'Volvo S80',
       'Renault Captur', 'Chevrolet Enjoy', 'Mahindra Bolero', 'Audi Q5',
       'Mitsubishi Cedia', 'Maruti S-Cross', 'Skoda Yeti',
       'Ford Endeavour', 'Mercedes-Benz GLS', 'Mercedes-Benz A',
       'Maruti SX4', 'Toyota Camry', 'Honda Mobilio', 'Fiat Linea',
       'Audi TT', 'Mahindra Renault', 'Jeep Compass', 'Ford Ikon',
       'Chevrolet Sail', 'Mahindra Quanto', 'Chevrolet Aveo',
       'Mahindra Xylo', 'Maruti Esteem', 'Tata Safari', 'Maruti Ignis',
       'Jaguar XJ', 'Nissan Sunny', 'Mercedes-Benz SLK-Class',
       'Volkswagen Passat', 'Maruti Dzire', 'Chevrolet Cruze',
       'Renault Koleos', 'Toyota Qualis', 'Volkswagen Ameo',
       'Maruti Grand', 'Datsun redi-GO', 'Smart Fortwo',
       'Mitsubishi Outlander', 'Porsche Cayman', 'Mercedes-Benz CLA',
       'Volvo XC60', 'Tata New', 'Porsche Boxster', 'Mahindra XUV300',
       'Tata Hexa', 'Tata Tiago', 'BMW 7', 'Fiat Avventura', 'Tata Tigor',
       'Volvo S60', 'Ambassador Classic', 'Volkswagen Beetle',
       'Fiat Petra', 'Hyundai Getz', 'Audi A7', 'Hyundai Elite',
       'Ford Aspire', 'Volkswagen Tiguan', 'Chevrolet Captiva',
       'Fiat Punto', 'Mahindra TUV', 'BMW X6', 'Tata Bolt',
       'Nissan Evalia', 'Renault Scala', 'Mahindra Jeep',
       'Hyundai Sonata', 'Ford Freestyle', 'Mahindra Logan',
       'Chevrolet Tavera', 'Volvo XC90', 'Renault Pulse',
       'Mitsubishi Montero', 'Porsche Panamera', 'Volkswagen CrossPolo',
       'Renault Fluence', 'Tata Venture', 'Tata Nexon', 'Isuzu MUX',
       'Toyota Platinum', 'Mercedes-Benz R-Class',
       'Mercedes-Benz CLS-Class', 'ISUZU D-MAX', 'Mercedes-Benz S-Class',
       'Mitsubishi Lancer', 'Ford Classic', 'Datsun Redi', 'Ford Mustang',
       'Ford Fusion', 'Fiat Siena', 'Maruti 1000',
       'Mercedes-Benz SL-Class', 'BMW Z4', 'Force One', 'Maruti Versa',
       'Honda WR-V', 'Bentley Continental', 'Lamborghini Gallardo',
       'Jaguar F'], dtype=object)
Location
Ahmedabad     223
Bangalore     353
Chennai       490
Coimbatore    634
Delhi         549
Hyderabad     741
Jaipur        410
Kochi         648
Kolkata       530
Mumbai        784
Pune          613
Name: Cars, dtype: int64

Observation- From the data above, Mumbai (784) and Hyderabad (741) have the largest numbers of second-hand car users, so they are our target audience.

Location
Ahmedabad           Volvo XC60
Bangalore            Volvo V40
Chennai              Volvo S80
Coimbatore           Volvo S60
Delhi                Volvo S60
Hyderabad           Volvo XC60
Jaipur        Volkswagen Vento
Kochi               Volvo XC90
Kolkata       Volkswagen Vento
Mumbai               Volvo S60
Pune                Volvo XC60
Name: Cars, dtype: object

Conclusion- The table above lists one car model per city. Every entry begins with "Vol" (Volvo or Volkswagen), which suggests that the aggregation returned the alphabetically last model name in each city rather than the most frequent one. The table therefore does not show which model is in highest demand. Hyderabad and Mumbai still have the highest number of second-hand car users, but identifying the target cars for those cities, and the main selling point for each location, requires the most frequent model per city instead.
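
The difference between the alphabetically last model and the most frequent model per city can be seen on a small example. The frame below is hypothetical, and the use of `max()` in the original cell is an assumption inferred from the output; the sketch only contrasts the two aggregations.

```python
import pandas as pd

# Hypothetical listings: Maruti Swift and Hyundai i20 are the most common models.
df = pd.DataFrame({
    "Location": ["Mumbai", "Mumbai", "Mumbai", "Pune", "Pune", "Pune"],
    "Cars": ["Maruti Swift", "Maruti Swift", "Volvo S60",
             "Hyundai i20", "Volvo XC60", "Hyundai i20"],
})

by_city = df.groupby("Location")["Cars"]

# Number of listings per city.
listings = by_city.count()

# max() on strings compares them alphabetically, so it favors names starting with "V".
alphabetical_last = by_city.max()

# The most frequent model per city is the mode of the Cars column.
most_common = by_city.agg(lambda s: s.value_counts().idxmax())

print(listings)
print(alphabetical_last)
print(most_common)
```

With ties, `value_counts().idxmax()` returns only one of the tied models, so a full ranking per city may be more informative when counts are close.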

Conclusion- This Power BI report compares the columns of the provided data and gives a clear analysis of the relationships among them.

Prediction by User Input¶

Enter your own data to test the model: